cochrans_q: Cochran's Q test for comparing multiple classifiers

https://rasbt.github.io/mlxtend/user_guide/evaluate/cochrans_q/

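A method for comparing several algorithms (classifiers) at the same time.

Outcome: either all classifiers perform the same (the null hypothesis), or there is a difference (in which case follow-up tests are run).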

In a sense, Cochran's Q test is analogous to ANOVA for binary outcomes.

Cochran's Q test tests the hypothesis that there is no difference between the classification accuracies: H0: p1 = p2 = ⋯ = pL.

Here p_i is the accuracy of the i-th of the L classifiers, and the null hypothesis H0 being tested is that these accuracies do not differ.

TODO: check the formula

4.6 Cochran’s Q Test for Comparing the Performance of Multiple Classifiers

With L classifiers, the statistic Q approximately follows a χ² distribution with L-1 degrees of freedom under H0.
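As a way to check the formula, here is a minimal sketch that computes Q by hand with NumPy, using Q = (L-1)(L ΣG_i² - T²) / (L T - ΣL_j²), where G_i is the number of samples classifier i gets right, L_j is the number of classifiers that get sample j right, and T is the total number of correct predictions. The names `cochrans_q_by_hand` and `with_errors` are made up for this note and are not part of mlxtend. The error positions follow example1.py; the test-set size of 100 is an assumption, and it does not affect Q because samples that every classifier gets right cancel out of both numerator and denominator.

```python
import math
import numpy as np


def cochrans_q_by_hand(y_true, *y_preds):
    """Return (Q, degrees of freedom) for L >= 2 classifiers on one test set."""
    y_true = np.asarray(y_true)
    # correct[j, i] is 1 if classifier i labels sample j correctly, else 0
    correct = np.column_stack([np.asarray(p) == y_true for p in y_preds]).astype(int)
    n_clf = correct.shape[1]           # L: number of classifiers
    g = correct.sum(axis=0)            # G_i: correct predictions per classifier
    l_j = correct.sum(axis=1)          # L_j: classifiers that got sample j right
    t = int(correct.sum())             # T: total number of correct predictions
    numerator = (n_clf - 1) * (n_clf * int(np.sum(g ** 2)) - t ** 2)
    # The denominator is 0 when every sample is classified right by all
    # classifiers or by none of them; that case is not handled in this sketch.
    denominator = n_clf * t - int(np.sum(l_j ** 2))
    return numerator / denominator, n_clf - 1


def with_errors(n, wrong_idx):
    """Hypothetical helper: predictions for an all-zero ground truth,
    wrong (label 1) at the given positions."""
    y = np.zeros(n, dtype=int)
    y[list(wrong_idx)] = 1
    return y


n = 100  # assumed test-set size
y_true = np.zeros(n, dtype=int)
y_model_1 = with_errors(n, range(16))
y_model_2 = with_errors(n, [0, 1, 2, 3, 4, 5, 20, 21])
y_model_3 = with_errors(n, [0, 1, 2, 6, 20, 21, 98, 99])

q, df = cochrans_q_by_hand(y_true, y_model_1, y_model_2, y_model_3)
# For df == 2 only, the chi-squared survival function is exp(-x / 2).
p_value = math.exp(-q / 2)
print(q, df, p_value)
```

For other values of L the p-value needs a general χ² survival function (for example from SciPy); the closed form used here holds only for two degrees of freedom.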

code:example1.py

>>> import numpy as np
>>> from mlxtend.evaluate import cochrans_q
>>> # ground-truth labels of the test set (100 samples)
>>> y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...                    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...                    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...                    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...                    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
>>> # predictions of the three classifiers on the same samples
>>> y_model_1 = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0,
...                       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...                       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...                       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...                       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
>>> y_model_2 = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...                       1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...                       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...                       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...                       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
>>> y_model_3 = np.array([1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...                       1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...                       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...                       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...                       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
>>> # significance level alpha = 0.05
>>> q, p_value = cochrans_q(y_true, y_model_1, y_model_2, y_model_3)
>>> q
7.529411764705882
>>> p_value  # p_value < 0.05, so the null hypothesis that all classifiers have equal accuracy is rejected
0.023174427241061245
>>> # next step: multiple post hoc pair-wise tests

One example of a post hoc test: McNemar tests with a Bonferroni correction (4.5 Multiple Hypotheses Testing)
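A rough sketch of what such a post hoc step could look like: run a McNemar test on every pair of classifiers and compare each p-value against alpha divided by the number of pairs. Everything here is written by hand for illustration; `mcnemar_by_hand` and `with_errors` are hypothetical names, not mlxtend functions, the test-set size of 100 is an assumption, and returning p = 1 when there are no discordant samples is a convention chosen for this sketch.

```python
import math
from itertools import combinations

import numpy as np


def mcnemar_by_hand(y_true, y_a, y_b, corrected=True):
    """Chi-squared McNemar test; returns (b, c, chi2, p_value)."""
    right_a = np.asarray(y_a) == np.asarray(y_true)
    right_b = np.asarray(y_b) == np.asarray(y_true)
    b = int(np.sum(right_a & ~right_b))   # only the first model is right
    c = int(np.sum(~right_a & right_b))   # only the second model is right
    if b + c == 0:
        # No discordant samples: treated here as "no evidence of a difference".
        return b, c, 0.0, 1.0
    diff = abs(b - c) - 1 if corrected else abs(b - c)
    chi2 = diff ** 2 / (b + c)
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return b, c, chi2, p_value


def with_errors(n, wrong_idx):
    """Hypothetical helper: predictions for an all-zero ground truth,
    wrong (label 1) at the given positions."""
    y = np.zeros(n, dtype=int)
    y[list(wrong_idx)] = 1
    return y


n = 100  # assumed test-set size
y_true = np.zeros(n, dtype=int)
models = {
    "model_1": with_errors(n, range(16)),
    "model_2": with_errors(n, [0, 1, 2, 3, 4, 5, 20, 21]),
    "model_3": with_errors(n, [0, 1, 2, 6, 20, 21, 98, 99]),
}

alpha = 0.05
pairs = list(combinations(models, 2))
alpha_adjusted = alpha / len(pairs)   # Bonferroni: divide alpha by the number of tests

results = {}
for name_a, name_b in pairs:
    b, c, chi2, p = mcnemar_by_hand(y_true, models[name_a], models[name_b])
    results[(name_a, name_b)] = (b, c, chi2, p)
    verdict = "reject H0" if p < alpha_adjusted else "fail to reject H0"
    print(f"{name_a} vs {name_b}: b={b}, c={c}, chi2={chi2:.4f}, p={p:.4f} -> {verdict}")
```

On this data none of the three pairs falls below the adjusted threshold of 0.05 / 3, even though the overall Q test rejects H0 at 0.05; a significant omnibus result does not guarantee a significant pair once the correction is applied.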

Let's illustrate that Cochran's Q test is indeed just a generalized version of McNemar's test (#マクネマー検定):

With two classifiers, it gives the same result as mcnemar: McNemar's test for classifier comparisons when the continuity correction is turned off.

code:cochran_q_and_mcnemar.py

>>> from mlxtend.evaluate import mcnemar, mcnemar_table
>>> # continues the session from example1.py (cochrans_q and the arrays are already defined)
>>> chi2, p_value = cochrans_q(y_true, y_model_1, y_model_2)
>>> chi2
5.333333333333333
>>> p_value
0.020921335337794035
>>> mcnemar(mcnemar_table(y_true, y_model_1, y_model_2), corrected=False)  # same result!
(5.333333333333333, 0.020921335337794035)
>>> cochrans_q(y_true, y_model_1, y_model_2) == mcnemar(mcnemar_table(y_true, y_model_1, y_model_2), corrected=False)
True
>>> # with the continuity correction (the default) the values differ
>>> mcnemar(mcnemar_table(y_true, y_model_1, y_model_2), corrected=True)
(4.083333333333333, 0.04330814281079206)
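The match is not specific to this data. As a short check, assuming the formula Q = (L-1)(L ΣG_i² - T²) / (L T - ΣL_j²), set L = 2 and write a for the number of samples both classifiers get right, b for those only the first gets right, and c for those only the second gets right (a, b, c are notation introduced here). Samples that both classifiers get wrong contribute nothing to any of the sums.

```latex
\begin{align*}
G_1 &= a + b, \qquad G_2 = a + c, \qquad T = G_1 + G_2 = 2a + b + c, \qquad \sum_j L_j^2 = 4a + b + c \\
Q &= (2 - 1)\,\frac{2\,(G_1^2 + G_2^2) - (G_1 + G_2)^2}{2T - \sum_j L_j^2}
   = \frac{(G_1 - G_2)^2}{2(2a + b + c) - (4a + b + c)}
   = \frac{(b - c)^2}{b + c}
\end{align*}
```

This is the McNemar statistic without the continuity correction, with L - 1 = 1 degree of freedom. For y_model_1 and y_model_2, b = 2 and c = 10, so Q = 64 / 12 ≈ 5.333, as in the output above.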